
    The Wikipedia Image Retrieval Task

    The Wikipedia image retrieval task at ImageCLEF provides a testbed for the system-oriented evaluation of visual information retrieval from a collection of Wikipedia images. The aim is to investigate the effectiveness of retrieval approaches that exploit textual and visual evidence in the context of a large and heterogeneous collection of images that are searched for by users with diverse information needs. This chapter presents an overview of the available test collections, summarises the retrieval approaches employed by the groups that participated in the task during the 2008 and 2009 ImageCLEF campaigns, provides an analysis of the main evaluation results, identifies best practices for effective retrieval, and discusses open issues.
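
    As a rough illustration of the combined textual and visual evidence such systems use (a minimal sketch under assumed score maps and weight, not any participant's actual method), scores from the two modalities can be fused linearly after normalisation:

        # Minimal sketch of linear late fusion of text and visual retrieval
        # scores; the score dictionaries and the weight alpha are assumptions.
        def min_max_normalise(scores):
            """Rescale a {doc_id: score} map to [0, 1] for comparability."""
            lo, hi = min(scores.values()), max(scores.values())
            span = (hi - lo) or 1.0
            return {doc: (s - lo) / span for doc, s in scores.items()}

        def fuse(text_scores, visual_scores, alpha=0.7):
            """Linearly combine normalised scores; alpha weights the text side."""
            t = min_max_normalise(text_scores)
            v = min_max_normalise(visual_scores)
            docs = set(t) | set(v)
            fused = {d: alpha * t.get(d, 0.0) + (1 - alpha) * v.get(d, 0.0)
                     for d in docs}
            return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)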

    Overview of the wikipediaMM task at ImageCLEF 2008

    The wikipediaMM task provides a testbed for the system-oriented evaluation of ad-hoc retrieval from a large collection of Wikipedia images. It became part of the ImageCLEF evaluation campaign in 2008 with the aim of investigating the combined use of visual and textual sources for improving retrieval performance. This paper presents an overview of the task's resources, topics, assessments, participants' approaches, and main results.

    Hybrid focused crawling on the Surface and the Dark Web

    Focused crawlers enable the automatic discovery of Web resources about a given topic by navigating the Web link structure and selecting which hyperlinks to follow based on their estimated relevance to the topic of interest. This work proposes a generic focused crawling framework for discovering resources on any given topic that reside on the Surface or the Dark Web. The proposed crawler seamlessly navigates the Surface Web and several darknets within the Dark Web (i.e., Tor, I2P, and Freenet) during a single crawl by automatically adapting its crawling behavior and its classifier-guided hyperlink selection strategy based on the destination network type and the strength of the local evidence in the vicinity of a hyperlink. The work investigates 11 hyperlink selection methods, among them a novel strategy based on the dynamic linear combination of a link-based classifier and a parent-Web-page classifier. The hybrid focused crawler is demonstrated on the discovery of Web resources containing recipes for producing homemade explosives, and the evaluation experiments indicate its effectiveness on both the Surface and the Dark Web.
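
    As a concrete illustration of such a hybrid selection strategy, the following minimal sketch blends a link-based and a parent-page classifier with a weight driven by the strength of the local evidence. It assumes scikit-learn-style classifiers; the names and the confidence heuristic are assumptions, not the paper's exact formulation.

        def link_score(link_features, page_features, link_clf, page_clf):
            """Score a hyperlink by dynamically blending two classifiers.

            The link classifier sees local evidence (anchor text, URL tokens,
            surrounding text); the page classifier sees the whole parent page.
            """
            p_link = link_clf.predict_proba([link_features])[0][1]
            p_page = page_clf.predict_proba([page_features])[0][1]
            # Weight shifts towards the link classifier when its prediction is
            # far from the 0.5 decision boundary, i.e. local evidence is strong.
            w = abs(p_link - 0.5) * 2
            return w * p_link + (1 - w) * p_page

    Scored links would then be pushed onto a priority queue so that the crawl frontier always expands the most promising hyperlink first.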

    Reliability and effectiveness of clickthrough data for automatic image annotation

    Automatic image annotation using supervised learning is performed by concept classifiers trained on labelled example images. This work proposes the use of clickthrough data collected from search logs as a source for the automatic generation of concept training data, thus avoiding the expensive manual annotation effort. We investigate and evaluate this approach using a collection of 97,628 photographic images. The results indicate that the contribution of search-log-based training data is positive despite its inherent noise; in particular, the combination of manual and automatically generated training data outperforms the use of manual data alone. It is therefore possible to use clickthrough data to perform large-scale image annotation with little manual annotation effort or, depending on performance requirements, using only the automatically generated training data. An extensive presentation of the experimental results and the accompanying data can be accessed at http://olympus.ee.auth.gr/~diou/civr2009/
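
    A minimal sketch of how click records might be turned into noisy concept labels, under an assumed (query, image_id) log layout and a click threshold that are not taken from the paper:

        from collections import defaultdict

        def clicks_to_labels(click_log, min_clicks=3):
            """Treat an image clicked >= min_clicks times for a query term
            as a noisy positive training example for that concept."""
            counts = defaultdict(int)
            for query, image_id in click_log:        # one record per click
                for term in query.lower().split():
                    counts[(term, image_id)] += 1
            labels = defaultdict(set)
            for (term, image_id), n in counts.items():
                if n >= min_clicks:
                    labels[term].add(image_id)       # positives per concept
            return labels

    The resulting per-concept positive sets can then be fed to standard supervised concept classifiers, alone or mixed with manually labelled examples.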

    VITALAS at TRECVID-2008

    In this paper, we present our experiments on the high-level feature extraction task at TRECVID 2008. This being the first year of our participation in TRECVID, our system adopts several popular approaches proposed by other groups in earlier campaigns. We propose two advanced low-level features, a new Gabor texture descriptor and a Compact-SIFT codeword histogram, and use the well-known LIBSVM library to train the base SVM classifiers. In the fusion step, several methods are employed, including voting, SVM-based fusion, HCRF, and Bootstrap Average AdaBoost (BAAB).
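
    For the simplest of the fusion schemes mentioned (voting), a minimal sketch might look as follows; it assumes scikit-learn-style classifier objects rather than the raw LIBSVM bindings, with one feature view per classifier.

        import numpy as np

        def vote_fusion(classifiers, feature_vectors):
            """Each classifier votes on its own feature view of the same shot;
            the concept is detected when a majority of views agree."""
            votes = np.array([clf.predict([x])[0]
                              for clf, x in zip(classifiers, feature_vectors)])
            return votes.sum() > len(classifiers) / 2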

    Measuring and Analyzing the Scholarly Impact of Experimental Evaluation Initiatives

    Evaluation initiatives have been widely credited with contributing greatly to the development and advancement of information access systems, by providing a sustainable platform for conducting the very demanding activity of comparable experimental evaluation at a large scale. Measuring the impact of such benchmarking activities is crucial for assessing which of their aspects have been successful, which activities should be continued, reinforced, or suspended, and which research paths should be pursued in the future. This work introduces a framework for modeling the data produced by evaluation campaigns, a methodology for measuring their scholarly impact, and tools that exploit visual analytics to analyze the outcomes.

    Not All Scale-Free Networks Are Born Equal: The Role of the Seed Graph in PPI Network Evolution

    The (asymptotic) degree distributions of the best-known “scale-free” network models are all similar and are independent of the seed graph used; hence, it has been tempting to assume that networks generated by these models are generally similar. In this paper, we observe that several key topological features of such networks depend heavily on the specific model and the seed graph used. Furthermore, we show that, starting with the “right” seed graph (typically a dense subgraph of the protein–protein interaction (PPI) network analyzed), the duplication model captures many topological features of publicly available PPI networks very well.
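
    A minimal sketch of a basic duplication model grown from a seed graph, illustrating why the seed matters; the retention probability and target size are assumptions, and published variants differ in their divergence steps.

        import random
        import networkx as nx

        def duplication_model(seed, n_target, p=0.3):
            """Grow `seed` to `n_target` nodes by repeated node duplication,
            retaining each copied edge independently with probability p."""
            g = seed.copy()
            next_id = max(g.nodes) + 1
            while g.number_of_nodes() < n_target:
                anchor = random.choice(list(g.nodes))
                g.add_node(next_id)
                for nbr in list(g.neighbors(anchor)):
                    if random.random() < p:
                        g.add_edge(next_id, nbr)
                next_id += 1
            return g

        # Comparing runs from different seeds exposes the seed's footprint:
        # duplication_model(nx.complete_graph(5), 1000) vs.
        # duplication_model(nx.cycle_graph(5), 1000)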

    Deploying Semantic Web Technologies for Information Fusion of Terrorism-related Content and Threat Detection on the Web

    The Web and social media nowadays play an increasingly significant role in spreading terrorism-related propaganda and content. To deploy counterterrorism measures, authorities rely on automated systems for analysing text, multimedia, and social media content on the Web. However, since each of these systems is an isolated solution, investigators often face the challenge of coping with a diverse array of heterogeneous sources and formats that generate vast volumes of data. Semantic Web technologies can alleviate this problem by delivering a toolset of mechanisms for knowledge representation, information fusion, semantic search, and sophisticated analyses of terrorist networks and spatiotemporal information. In the Semantic Web environment, ontologies play a key role by offering a shared, uniform model for semantically integrating information from multimodal heterogeneous sources; an additional benefit is that they can be augmented with powerful tools for semantic enrichment and reasoning. This paper presents such a unified semantic infrastructure for the information fusion of terrorism-related content and threat detection on the Web. The framework is deployed within the TENSOR EU-funded project and consists of an ontology and an adaptable semantic reasoning mechanism. We strongly believe that, in both the short and the long term, these techniques can greatly assist Law Enforcement Agencies in their investigative operations.
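
    A minimal sketch of the fusion idea using rdflib: outputs of separate text- and image-analysis components are asserted into one RDF graph so that a single query spans both. The namespace, classes, and properties here are invented for illustration and are not the TENSOR ontology.

        from rdflib import Graph, Literal, Namespace
        from rdflib.namespace import RDF

        EX = Namespace("http://example.org/threat#")
        g = Graph()

        # Assumed output of a text-analysis component:
        g.add((EX.post42, RDF.type, EX.SocialMediaPost))
        g.add((EX.post42, EX.mentionsTopic, Literal("explosive precursor")))

        # Assumed output of an image-analysis component on the same post:
        g.add((EX.post42, EX.containsConcept, Literal("weapon")))

        # After fusion, one SPARQL query covers both modalities:
        q = """SELECT ?post WHERE {
                   ?post ex:mentionsTopic ?t ;
                         ex:containsConcept ?c . }"""
        for row in g.query(q, initNs={"ex": EX}):
            print(row.post)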

    MLM: A Benchmark Dataset for Multitask Learning with Multiple Languages and Modalities

    In this paper, we introduce the MLM (Multiple Languages and Modalities) dataset, a new resource for training and evaluating multitask systems on samples in multiple modalities and three languages. The generation process and the inclusion of semantic data provide a resource that further tests the ability of multitask systems to learn relationships between entities. The dataset is designed for researchers and developers who build applications that perform multiple tasks on data encountered on the Web and in digital archives. A second version of MLM provides a geo-representative subset of the data with weighted samples for the countries of the European Union. We demonstrate the value of the resource in developing novel applications in the digital humanities with a motivating use case, and specify a benchmark set of tasks to retrieve modalities and locate entities in the dataset. Evaluation of baseline multitask and single-task systems on the full and geo-representative versions of MLM demonstrates the challenges of generalising on diverse data. In addition to the digital humanities, we expect the resource to contribute to research in multimodal representation learning, location estimation, and scene understanding.
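
    For the retrieval side of such a benchmark, a standard recall@k evaluation over embedding matrices might be sketched as follows; the row-aligned data layout is an assumption, not the MLM distribution format.

        import numpy as np

        def recall_at_k(query_emb, gallery_emb, k=10):
            """Fraction of queries whose ground-truth item (same row index
            in the gallery) appears in the top-k by cosine similarity."""
            q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
            g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
            sims = q @ g.T                                # queries x gallery
            topk = np.argsort(-sims, axis=1)[:, :k]
            hits = (topk == np.arange(len(q))[:, None]).any(axis=1)
            return float(hits.mean())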